Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations
Authors
Abstract
We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries. Moreover, using our tools, we develop an algorithm that provides a regret bound of $O\!\left(U \sqrt{T \log\!\left(U \sqrt{T} \log T + 1\right)}\right)$, where $U$ is the $L_2$ norm of an arbitrary comparator and both $T$ and $U$ are unknown to the player. This bound is optimal up to $\sqrt{\log \log T}$ terms. When $T$ is known, we derive an algorithm with an optimal regret bound (up to constant factors). For both the known and unknown $T$ case, a Normal approximation to the conditional value of the game proves to be the key analysis tool.
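For reference, the quantity being bounded can be written out explicitly. The formalization below is the standard unconstrained-regret setup implied by the abstract, not a quotation from the paper.

```latex
% Unconstrained online linear optimization: on round t the player picks
% w_t in a Hilbert space, then observes g_t and incurs the linear loss <g_t, w_t>.
% Regret is measured against an arbitrary, unconstrained comparator u:
\[
  \mathrm{Regret}_T(u) \;=\; \sum_{t=1}^{T} \langle g_t, w_t \rangle
                       \;-\; \sum_{t=1}^{T} \langle g_t, u \rangle .
\]
% The bound in the abstract holds simultaneously for every comparator u,
% with U = \|u\|_2 and with neither U nor T known in advance:
\[
  \mathrm{Regret}_T(u) \;\le\; O\!\left( U \sqrt{T \,\log\!\left(U\sqrt{T}\,\log T + 1\right)} \right).
\]
```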
Similar resources
On Early Stopping in Gradient Descent Learning
In this paper, we study a family of gradient descent algorithms to approximate the regression function from Reproducing Kernel Hilbert Spaces (RKHSs), the family being characterized by a polynomially decreasing rate of step sizes (the learning rate). By solving a bias-variance trade-off we obtain an early stopping rule and some probabilistic upper bounds for the convergence of the algorithms. Thes...
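As a toy illustration of the idea in this entry, the sketch below runs gradient descent on a least-squares objective with a polynomially decaying step size and stops after a fixed iteration budget. The exponent `theta`, the stopping budget, and the synthetic data are placeholders, not the rule derived in the cited paper.

```python
# Gradient descent on 0.5*||Xw - y||^2 / n with step size eta0 / t^theta,
# returned ("stopped") after a fixed budget of iterations.
import numpy as np

def gd_early_stopping(X, y, n_iters=50, eta0=0.5, theta=0.5):
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        grad = X.T @ (X @ w - y) / n
        w -= (eta0 / t**theta) * grad   # polynomially decaying learning rate
    return w                            # early stopping = returning after n_iters

# Usage on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
print(gd_early_stopping(X, y))
```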
Regret bounds for Non Convex Quadratic Losses Online Learning over Reproducing Kernel Hilbert Spaces
We present several online algorithms with dimension-free regret bounds for general non-convex quadratic losses by viewing them as functions in Reproducing Kernel Hilbert Spaces. In our work we adapt the Online Gradient Descent, Follow the Regularized Leader, and Conditional Gradient meta-algorithms to the RKHS setting and provide regret bounds in this setting. By analyzing them as algorith...
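A minimal sketch of one ingredient mentioned in this entry: online gradient descent run in an RKHS via a growing kernel expansion. The Gaussian kernel, squared loss, and step size are illustrative choices, not those of the cited work.

```python
# Online Gradient Descent in an RKHS: the hypothesis is f(x) = sum_i alpha_i * k(x_i, x),
# and a gradient step on the squared loss at (x_t, y_t) appends the coefficient
# -eta * (f(x_t) - y_t) on the new point x_t.
import numpy as np

def gaussian_kernel(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def kernel_ogd(stream, eta=0.1, gamma=1.0):
    """Run kernelized OGD on an iterable of (x_t, y_t) pairs; return (points, alphas)."""
    points, alphas = [], []
    for x_t, y_t in stream:
        # Current prediction f(x_t) from the stored expansion.
        f_xt = sum(a * gaussian_kernel(x_i, x_t, gamma) for x_i, a in zip(points, alphas))
        # Gradient of 0.5*(f(x_t) - y_t)^2 w.r.t. f is (f(x_t) - y_t) * k(x_t, .).
        points.append(x_t)
        alphas.append(-eta * (f_xt - y_t))
    return points, alphas

# Usage on a toy data stream:
rng = np.random.default_rng(0)
stream = [(rng.normal(size=2), rng.normal()) for _ in range(20)]
pts, al = kernel_ogd(stream)
print(len(pts), al[:3])
```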
Coin Betting and Parameter-Free Online Learning
In recent years, a number of parameter-free algorithms have been developed for online linear optimization over Hilbert spaces and for learning with expert advice. These algorithms achieve optimal regret bounds that depend on the unknown competitors, without having to tune the learning rates with oracle choices. We present a new intuitive framework to design parameter-free algorithms based o...
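The coin-betting view mentioned in this entry can be illustrated with a one-dimensional Krichevsky-Trofimov bettor. This is a sketch of the general framework rather than the exact algorithm of any one paper; the initial wealth `eps` and the gradient sequence are placeholders.

```python
# 1-D coin-betting learner: bet a signed fraction of current wealth,
# where the "coin" at round t is c_t = -g_t and the fraction is the
# Krichevsky-Trofimov estimate (sum of past coins) / t.

def kt_coin_betting(grads, eps=1.0):
    """Run the KT bettor on a sequence of gradients g_t in [-1, 1].

    Plays w_t = beta_t * Wealth_{t-1} with beta_t = (sum of past coins) / t.
    Returns the list of predictions w_1, ..., w_T.
    """
    wealth = eps          # initial endowment
    coin_sum = 0.0        # sum of past coin outcomes c_s = -g_s
    preds = []
    for t, g in enumerate(grads, start=1):
        beta = coin_sum / t            # signed fraction of wealth to bet
        w = beta * wealth              # prediction for this round
        preds.append(w)
        c = -g                         # coin outcome
        wealth += c * w                # wealth update
        coin_sum += c
    return preds

# Example usage with a toy gradient sequence:
print(kt_coin_betting([0.5, -1.0, 0.3, -0.2, 1.0]))
```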
Adaptive Filtering in Reproducing Kernel Hilbert Spaces (Weifeng Liu, Ph.D. dissertation, University of Florida)
Abstract of dissertation, December 2008. Chair: Jose C. Principe. Major: Electrical and Computer Engineering. The theory of linear adaptive filters has reached maturity, unlike the field of nonli...
Regularized Policy Iteration with Nonparametric Function Spaces
We study two regularization-based approximate policy iteration algorithms, namely REG-LSPI and REG-BRM, to solve reinforcement learning and planning problems in discounted Markov Decision Processes with large state and finite action spaces. At the core of these algorithms are regularized extensions of Least-Squares Temporal Difference (LSTD) learning and Bellman Residual Minimization (BRM),...
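A compact sketch of the regularized LSTD building block referred to in this entry. Linear features and a ridge (L2) penalty are used here as a simplification of the nonparametric setting of the cited work; the sample data are placeholders.

```python
# L2-regularized LSTD for policy evaluation: solve (A + lam*I) w = b, where
# A = Phi^T (Phi - gamma * Phi') / n and b = Phi^T r / n over transition samples.
import numpy as np

def regularized_lstd(phis, rewards, next_phis, gamma=0.95, lam=1e-2):
    """Estimate value-function weights w from transition samples.

    phis:      (n, d) features of visited states s_t
    rewards:   (n,)   rewards r_t
    next_phis: (n, d) features of successor states s_{t+1} under the policy
    """
    n, d = phis.shape
    A = phis.T @ (phis - gamma * next_phis) / n
    b = phis.T @ rewards / n
    return np.linalg.solve(A + lam * np.eye(d), b)

# Usage on random placeholder transitions:
rng = np.random.default_rng(0)
phis, next_phis = rng.normal(size=(200, 4)), rng.normal(size=(200, 4))
rewards = rng.normal(size=200)
print(regularized_lstd(phis, rewards, next_phis))
```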
Publication date: 2014